Multi-Cloud Workflow with Pangeo¶
This example demonstrates a workflow using analysis-ready data provided in two public clouds.
LENS (Hosted on AWS in the us-west-2 region)
ERA5 (Hosted on Google Cloud Platform in multiple regions)
We’ll perform a similar analysis on each dataset, computing a histogram of the total precipitation, and then compare the results. Notably, this computation reduces a large dataset to a small summary, and the reduction can happen on a cluster in the cloud.
By placing a compute cluster in the cloud next to the data, we avoid moving large amounts of data over the public internet. The large analysis-ready data only needs to move within a cloud region: from the machines storing the data in an object-store like S3 to the machines performing the analysis. The compute cluster reduces the large amount of data to a small histogram summary. At just a handful of KBs, the summary statistics can easily be moved back to the local client, which might be running on a laptop. This also avoids costly egress charges from moving large amounts of data out of cloud regions.
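To put rough numbers on this, here's a back-of-envelope comparison of moving the full dataset out of the cloud versus moving only the reduced summary. The per-GB egress price is an assumption for illustration only; actual prices vary by provider and region.

```python
# Hedged back-of-envelope: egress cost of moving data out of a cloud region.
# The $0.09/GB price is an illustrative assumption, not a quoted rate.
egress_price_per_gb = 0.09            # USD per GB (assumed)

full_dataset_gb = 1.46e12 / 1e9       # ~1.46 TB of analysis-ready data
summary_gb = 288e3 / 1e9              # ~288 kB histogram summary

print(f"full dataset: ~${full_dataset_gb * egress_price_per_gb:,.2f}")
print(f"summary only: ~${summary_gb * egress_price_per_gb:.6f}")
```

Moving the raw data would cost on the order of a hundred dollars per pass; moving the summary is effectively free.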
import getpass
import dask
from distributed import Client
from dask_gateway import Gateway, BasicAuth
import intake
import numpy as np
import s3fs
import xarray as xr
from xhistogram.xarray import histogram
Create Dask Clusters¶
We’ve deployed Dask Gateway on two Kubernetes clusters, one in AWS and one in GCP. We’ll use these to create Dask clusters in the same cloud region as the data. We’ll connect to both of them from the same interactive notebook session.
password = getpass.getpass()
auth = BasicAuth("pangeo", password)
# Create a Dask Cluster on AWS
aws_gateway = Gateway(
"http://a00670d37945911eab47102a1da71b1b-524946043.us-west-2.elb.amazonaws.com",
auth=auth,
)
aws = aws_gateway.new_cluster()
aws_client = Client(aws, set_as_default=False)
aws_client
# Create a Dask Cluster on GCP
gcp_gateway = Gateway(
"http://34.72.56.89",
auth=auth,
)
gcp = gcp_gateway.new_cluster()
gcp_client = Client(gcp, set_as_default=False)
gcp_client
We’ll enable adaptive mode on each of the Dask clusters. Workers will be added and removed as needed by the current level of computation.
aws.adapt(minimum=1, maximum=200)
gcp.adapt(minimum=1, maximum=200)
ERA5 on Google Cloud Storage¶
We’ll use Intake and Pangeo’s data catalog to discover the dataset.
cat = intake.open_catalog(
"https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml"
)
cat
<Intake catalog: master>
The next cell loads the metadata as an xarray dataset. No large amount of data is read or transferred here; the data will be loaded on demand when we ask for a concrete result later.
era5 = cat.atmosphere.era5_hourly_reanalysis_single_levels_sa(
storage_options={"requester_pays": False, "token": "anon"}
).to_dask()
era5
<xarray.Dataset>
Dimensions:    (latitude: 721, longitude: 1440, time: 350640)
Coordinates:
  * latitude   (latitude) float32 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
  * longitude  (longitude) float32 0.0 0.25 0.5 ... 359.5 359.75
  * time       (time) datetime64[ns] 1979-01-01 ... 2018-12-31T23:00:00
Data variables:
    asn, d2m, e, mn2t, mx2t, ptype, ro, sd, sro, ssr, t2m, tcc, tcrw, tp, tsn, u10, v10
    (each (time, latitude, longitude) float32, 1.46 TB per variable,
     dask chunks of (31, 721, 1440), 128.74 MB each)
Attributes:
    Conventions:  CF-1.6
    history:      2019-09-20 05:15:01 GMT by grib_to_netcdf-2.10.0 ...
We’re computing the histogram on the total precipitation for a specific time period. xarray makes selecting this subset of data quite natural. Again, we still haven’t loaded the data.
tp = era5.tp.sel(time=slice('1990-01-01', '2005-12-31'))
tp
<xarray.DataArray 'tp' (time: 140256, latitude: 721, longitude: 1440)>
dask.array<chunksize=(9, 721, 1440), meta=np.ndarray>  (582.48 GB in 4526 chunks)
Coordinates:
  * latitude   (latitude) float32 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
  * longitude  (longitude) float32 0.0 0.25 0.5 ... 359.5 359.75
  * time       (time) datetime64[ns] 1990-01-01 ... 2005-12-31T23:00:00
Attributes:
    long_name:  Total precipitation
    units:      m
To compare to the 6-hourly LENS dataset, we’ll aggregate to 6-hourly totals.
# convert to 6-hourly precip totals
tp_6hr = tp.coarsen(time=6).sum()
tp_6hr
<xarray.DataArray (time: 23376, latitude: 721, longitude: 1440)>
dask.array<chunksize=(1, 721, 1440), meta=np.ndarray>  (97.08 GB in 4526 chunks)
Coordinates:
  * latitude   (latitude) float32 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
  * longitude  (longitude) float32 0.0 0.25 0.5 ... 359.5 359.75
  * time       (time) datetime64[ns] 1990-01-01T02:30:00 ... 2005-12-31T20:30:00
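The effect of coarsen(time=6).sum() can be checked on a small synthetic series. Here is a minimal NumPy sketch of the same reduction: summing non-overlapping windows of 6 consecutive values along the time axis.

```python
import numpy as np

# Twelve fake "hourly" precipitation values
hourly = np.arange(12, dtype=float)

# Summing non-overlapping windows of 6 values is what
# tp.coarsen(time=6).sum() does along the time dimension
six_hourly = hourly.reshape(-1, 6).sum(axis=1)
print(six_hourly)  # [15. 51.]
```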
We’ll bin the data using the following bin edges.
tp_6hr_bins = np.concatenate([[0], np.logspace(-5, 0, 50)])
tp_6hr_bins
array([0.00000000e+00, 1.00000000e-05, 1.26485522e-05, 1.59985872e-05,
2.02358965e-05, 2.55954792e-05, 3.23745754e-05, 4.09491506e-05,
5.17947468e-05, 6.55128557e-05, 8.28642773e-05, 1.04811313e-04,
1.32571137e-04, 1.67683294e-04, 2.12095089e-04, 2.68269580e-04,
3.39322177e-04, 4.29193426e-04, 5.42867544e-04, 6.86648845e-04,
8.68511374e-04, 1.09854114e-03, 1.38949549e-03, 1.75751062e-03,
2.22299648e-03, 2.81176870e-03, 3.55648031e-03, 4.49843267e-03,
5.68986603e-03, 7.19685673e-03, 9.10298178e-03, 1.15139540e-02,
1.45634848e-02, 1.84206997e-02, 2.32995181e-02, 2.94705170e-02,
3.72759372e-02, 4.71486636e-02, 5.96362332e-02, 7.54312006e-02,
9.54095476e-02, 1.20679264e-01, 1.52641797e-01, 1.93069773e-01,
2.44205309e-01, 3.08884360e-01, 3.90693994e-01, 4.94171336e-01,
6.25055193e-01, 7.90604321e-01, 1.00000000e+00])
The next cell applies the histogram along the longitude dimension and takes the mean over time.
We’re still just building up the computation here; we haven’t actually loaded the data or executed anything yet.
tp_hist = histogram(
tp_6hr.rename('tp_6hr'), bins=[tp_6hr_bins], dim=['longitude']
).mean(dim='time')
tp_hist.data
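A minimal NumPy sketch of the reduction xhistogram expresses here, run on a toy array: one histogram over the longitude axis for every (time, latitude) pair, then a mean of the counts over time, leaving a (latitude, bins) result.

```python
import numpy as np

rng = np.random.default_rng(0)
tp_toy = rng.random((4, 3, 5))        # toy (time, latitude, longitude) cube
bins = np.linspace(0.0, 1.0, 11)      # 10 bins

# Histogram over the longitude axis for each (time, latitude) pair...
counts = np.stack([
    [np.histogram(tp_toy[t, j], bins=bins)[0] for j in range(tp_toy.shape[1])]
    for t in range(tp_toy.shape[0])
])
# ...then average the counts over time: (time, lat, bins) -> (lat, bins)
hist = counts.mean(axis=0)
print(hist.shape)                     # (3, 10)
```

Each output row still sums to the number of longitude points, so no data is lost in the binning, only in the time average.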
In total, we’re going from the ~1.5 TB raw dataset down to a small 288 kB result that is the histogram summarizing the total precipitation. We’ve built up a large sequence of operations to do that reduction (over 110,000 individual tasks), and now it’s time to actually execute it. There will be some delay between running the next cell, the scheduler receiving the task graph, and the cluster starting to process it, but work is happening in the background. After a minute or so, tasks will start appearing on the Dask dashboard.
One thing to note: we request this result with the gcp_client, the client for the cluster in the same cloud region as the data.
era5_tp_hist_ = gcp_client.compute(tp_hist, retries=5)
era5_tp_hist_ is a Future pointing to the result on the cluster. The actual computation is happening in the background, and we’ll call .result() to get the concrete result later on.
era5_tp_hist_
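The same compute/Future pattern can be tried locally with an in-process cluster. This sketch assumes dask and distributed are installed and uses a small dask array in place of the real data.

```python
import dask.array as da
from distributed import Client

client = Client(processes=False)      # small in-process cluster
x = da.ones((1000,), chunks=100).sum()

fut = client.compute(x)               # returns a Future immediately
result = fut.result()                 # blocks until the result is ready
print(result)                         # 1000.0
client.close()
```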
Because the Dask cluster is in adaptive mode, this computation kicks off a chain of events: Dask notices that it suddenly has many tasks to compute, so it asks the cluster manager (Kubernetes in this case) for more workers. The Kubernetes cluster then asks its compute backend (Google Compute Engine in this case) for more virtual machines. As these machines come online, our workers come to life and the cluster starts making progress on our computation.
LENS on AWS¶
This computation is very similar to the ERA5 computation. The primary difference is that the LENS dataset is an ensemble. We’ll histogram a single member of that ensemble.
The Intake catalog created by NCAR includes many datasets, so we’ll use intake-esm to search for the URL we want.
col = intake.open_esm_datastore(
"https://raw.githubusercontent.com/NCAR/cesm-lens-aws/master/intake-catalogs/aws-cesm1-le.json"
)
res = col.search(frequency='hourly6-1990-2005', variable='PRECT')
res.df
|   | component | frequency         | experiment | variable | path                                              |
|---|-----------|-------------------|------------|----------|---------------------------------------------------|
| 0 | atm       | hourly6-1990-2005 | 20C        | PRECT    | s3://ncar-cesm-lens/atm/hourly6-1990-2005/cesm... |
url = res.df.loc[0, "path"]
url
's3://ncar-cesm-lens/atm/hourly6-1990-2005/cesmLE-20C-PRECT.zarr'
We’ll (lazily) load that data from S3 using s3fs, xarray, and zarr.
fs = s3fs.S3FileSystem(anon=True)
lens = xr.open_zarr(fs.get_mapper(url), consolidated=True)
lens
- ilev: 31
- lat: 192
- lev: 30
- lon: 288
- member_id: 36
- nbnd: 2
- slat: 191
- slon: 288
- time: 23360
- ilev(ilev)float642.255 5.032 10.16 ... 985.1 1e+03
- formula_terms :
- a: hyai b: hybi p0: P0 ps: PS
- long_name :
- hybrid level at interfaces (1000*(A+B))
- positive :
- down
- standard_name :
- atmosphere_hybrid_sigma_pressure_coordinate
- units :
- level
array([ 2.25524 , 5.031692, 10.157947, 18.555317, 30.669123, 45.867477, 63.323483, 80.701418, 94.941042, 111.693211, 131.401271, 154.586807, 181.863353, 213.952821, 251.704417, 296.117216, 348.366588, 409.835219, 482.149929, 567.224421, 652.332969, 730.445892, 796.363071, 845.353667, 873.715866, 900.324631, 924.964462, 947.432335, 967.538625, 985.11219 , 1000. ]) - lat(lat)float64-90.0 -89.06 -88.12 ... 89.06 90.0
- long_name :
- latitude
- units :
- degrees_north
array([-90. , -89.057592, -88.115183, -87.172775, -86.230366, -85.287958, -84.34555 , -83.403141, -82.460733, -81.518325, -80.575916, -79.633508, -78.691099, -77.748691, -76.806283, -75.863874, -74.921466, -73.979058, -73.036649, -72.094241, -71.151832, -70.209424, -69.267016, -68.324607, -67.382199, -66.439791, -65.497382, -64.554974, -63.612565, -62.670157, -61.727749, -60.78534 , -59.842932, -58.900524, -57.958115, -57.015707, -56.073298, -55.13089 , -54.188482, -53.246073, -52.303665, -51.361257, -50.418848, -49.47644 , -48.534031, -47.591623, -46.649215, -45.706806, -44.764398, -43.82199 , -42.879581, -41.937173, -40.994764, -40.052356, -39.109948, -38.167539, -37.225131, -36.282723, -35.340314, -34.397906, -33.455497, -32.513089, -31.570681, -30.628272, -29.685864, -28.743455, -27.801047, -26.858639, -25.91623 , -24.973822, -24.031414, -23.089005, -22.146597, -21.204188, -20.26178 , -19.319372, -18.376963, -17.434555, -16.492147, -15.549738, -14.60733 , -13.664921, -12.722513, -11.780105, -10.837696, -9.895288, -8.95288 , -8.010471, -7.068063, -6.125654, -5.183246, -4.240838, -3.298429, -2.356021, -1.413613, -0.471204, 0.471204, 1.413613, 2.356021, 3.298429, 4.240838, 5.183246, 6.125654, 7.068063, 8.010471, 8.95288 , 9.895288, 10.837696, 11.780105, 12.722513, 13.664921, 14.60733 , 15.549738, 16.492147, 17.434555, 18.376963, 19.319372, 20.26178 , 21.204188, 22.146597, 23.089005, 24.031414, 24.973822, 25.91623 , 26.858639, 27.801047, 28.743455, 29.685864, 30.628272, 31.570681, 32.513089, 33.455497, 34.397906, 35.340314, 36.282723, 37.225131, 38.167539, 39.109948, 40.052356, 40.994764, 41.937173, 42.879581, 43.82199 , 44.764398, 45.706806, 46.649215, 47.591623, 48.534031, 49.47644 , 50.418848, 51.361257, 52.303665, 53.246073, 54.188482, 55.13089 , 56.073298, 57.015707, 57.958115, 58.900524, 59.842932, 60.78534 , 61.727749, 62.670157, 63.612565, 64.554974, 65.497382, 66.439791, 67.382199, 68.324607, 69.267016, 70.209424, 71.151832, 72.094241, 73.036649, 
73.979058, 74.921466, 75.863874, 76.806283, 77.748691, 78.691099, 79.633508, 80.575916, 81.518325, 82.460733, 83.403141, 84.34555 , 85.287958, 86.230366, 87.172775, 88.115183, 89.057592, 90. ]) - lev(lev)float643.643 7.595 14.36 ... 976.3 992.6
- formula_terms :
- a: hyam b: hybm p0: P0 ps: PS
- long_name :
- hybrid level at midpoints (1000*(A+B))
- positive :
- down
- standard_name :
- atmosphere_hybrid_sigma_pressure_coordinate
- units :
- level
array([ 3.643466, 7.59482 , 14.356632, 24.61222 , 38.2683 , 54.59548 , 72.012451, 87.82123 , 103.317127, 121.547241, 142.994039, 168.22508 , 197.908087, 232.828619, 273.910817, 322.241902, 379.100904, 445.992574, 524.687175, 609.778695, 691.38943 , 763.404481, 820.858369, 859.534767, 887.020249, 912.644547, 936.198398, 957.48548 , 976.325407, 992.556095]) - lon(lon)float640.0 1.25 2.5 ... 356.2 357.5 358.8
- long_name :
- longitude
- units :
- degrees_east
array([ 0. , 1.25, 2.5 , ..., 356.25, 357.5 , 358.75])
- member_id(member_id)int641 2 3 4 5 6 ... 31 32 33 34 35 104
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 104]) - slat(slat)float64-89.53 -88.59 ... 88.59 89.53
- long_name :
- staggered latitude
- units :
- degrees_north
array([-89.528796, -88.586387, -87.643979, -86.701571, -85.759162, -84.816754, -83.874346, -82.931937, -81.989529, -81.04712 , -80.104712, -79.162304, -78.219895, -77.277487, -76.335079, -75.39267 , -74.450262, -73.507853, -72.565445, -71.623037, -70.680628, -69.73822 , -68.795812, -67.853403, -66.910995, -65.968586, -65.026178, -64.08377 , -63.141361, -62.198953, -61.256545, -60.314136, -59.371728, -58.429319, -57.486911, -56.544503, -55.602094, -54.659686, -53.717277, -52.774869, -51.832461, -50.890052, -49.947644, -49.005236, -48.062827, -47.120419, -46.17801 , -45.235602, -44.293194, -43.350785, -42.408377, -41.465969, -40.52356 , -39.581152, -38.638743, -37.696335, -36.753927, -35.811518, -34.86911 , -33.926702, -32.984293, -32.041885, -31.099476, -30.157068, -29.21466 , -28.272251, -27.329843, -26.387435, -25.445026, -24.502618, -23.560209, -22.617801, -21.675393, -20.732984, -19.790576, -18.848168, -17.905759, -16.963351, -16.020942, -15.078534, -14.136126, -13.193717, -12.251309, -11.308901, -10.366492, -9.424084, -8.481675, -7.539267, -6.596859, -5.65445 , -4.712042, -3.769634, -2.827225, -1.884817, -0.942408, 0. 
, 0.942408, 1.884817, 2.827225, 3.769634, 4.712042, 5.65445 , 6.596859, 7.539267, 8.481675, 9.424084, 10.366492, 11.308901, 12.251309, 13.193717, 14.136126, 15.078534, 16.020942, 16.963351, 17.905759, 18.848168, 19.790576, 20.732984, 21.675393, 22.617801, 23.560209, 24.502618, 25.445026, 26.387435, 27.329843, 28.272251, 29.21466 , 30.157068, 31.099476, 32.041885, 32.984293, 33.926702, 34.86911 , 35.811518, 36.753927, 37.696335, 38.638743, 39.581152, 40.52356 , 41.465969, 42.408377, 43.350785, 44.293194, 45.235602, 46.17801 , 47.120419, 48.062827, 49.005236, 49.947644, 50.890052, 51.832461, 52.774869, 53.717277, 54.659686, 55.602094, 56.544503, 57.486911, 58.429319, 59.371728, 60.314136, 61.256545, 62.198953, 63.141361, 64.08377 , 65.026178, 65.968586, 66.910995, 67.853403, 68.795812, 69.73822 , 70.680628, 71.623037, 72.565445, 73.507853, 74.450262, 75.39267 , 76.335079, 77.277487, 78.219895, 79.162304, 80.104712, 81.04712 , 81.989529, 82.931937, 83.874346, 84.816754, 85.759162, 86.701571, 87.643979, 88.586387, 89.528796]) - slon(slon)float64-0.625 0.625 1.875 ... 356.9 358.1
- long_name :
- staggered longitude
- units :
- degrees_east
array([ -0.625, 0.625, 1.875, ..., 355.625, 356.875, 358.125])
- time(time)object1990-01-01 06:00:00 ... 2006-01-01 00:00:00
- bounds :
- time_bnds
- long_name :
- time
array([cftime.DatetimeNoLeap(1990-01-01 06:00:00), cftime.DatetimeNoLeap(1990-01-01 12:00:00), cftime.DatetimeNoLeap(1990-01-01 18:00:00), ..., cftime.DatetimeNoLeap(2005-12-31 12:00:00), cftime.DatetimeNoLeap(2005-12-31 18:00:00), cftime.DatetimeNoLeap(2006-01-01 00:00:00)], dtype=object)
- P0()float64...
- long_name :
- reference pressure
- units :
- Pa
array(100000.)
- PRECT(member_id, time, lat, lon)float32dask.array<chunksize=(2, 504, 192, 288), meta=np.ndarray>
- cell_methods :
- time: mean
- long_name :
- Total (convective and large-scale) precipitation rate (liq + ice)
- units :
- m/s
Array Chunk Bytes 186.01 GB 222.95 MB Shape (36, 23360, 192, 288) (2, 504, 192, 288) Count 847 Tasks 846 Chunks Type float32 numpy.ndarray - area(lat, lon)float32dask.array<chunksize=(192, 288), meta=np.ndarray>
- long_name :
- Grid-Cell Area
- standard_name :
- cell_area
- units :
- m2
Array Chunk Bytes 221.18 kB 221.18 kB Shape (192, 288) (192, 288) Count 2 Tasks 1 Chunks Type float32 numpy.ndarray - ch4vmr(time)float64dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- ch4 volume mixing ratio
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type float64 numpy.ndarray - co2vmr(time)float64dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- co2 volume mixing ratio
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type float64 numpy.ndarray - date(time)int32dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- current date (YYYYMMDD)
Array Chunk Bytes 93.44 kB 2.02 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type int32 numpy.ndarray - date_written(time)|S8dask.array<chunksize=(504,), meta=np.ndarray>
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type |S8 numpy.ndarray - datesec(time)int32dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- current seconds of current date
Array Chunk Bytes 93.44 kB 2.02 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type int32 numpy.ndarray - f11vmr(time)float64dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- f11 volume mixing ratio
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type float64 numpy.ndarray - f12vmr(time)float64dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- f12 volume mixing ratio
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type float64 numpy.ndarray - gw(lat)float64dask.array<chunksize=(192,), meta=np.ndarray>
- long_name :
- gauss weights
Array Chunk Bytes 1.54 kB 1.54 kB Shape (192,) (192,) Count 2 Tasks 1 Chunks Type float64 numpy.ndarray - hyai(ilev)float64dask.array<chunksize=(31,), meta=np.ndarray>
- long_name :
- hybrid A coefficient at layer interfaces
Array Chunk Bytes 248 B 248 B Shape (31,) (31,) Count 2 Tasks 1 Chunks Type float64 numpy.ndarray - hyam(lev)float64dask.array<chunksize=(30,), meta=np.ndarray>
- long_name :
- hybrid A coefficient at layer midpoints
Array Chunk Bytes 240 B 240 B Shape (30,) (30,) Count 2 Tasks 1 Chunks Type float64 numpy.ndarray - hybi(ilev)float64dask.array<chunksize=(31,), meta=np.ndarray>
- long_name :
- hybrid B coefficient at layer interfaces
Array Chunk Bytes 248 B 248 B Shape (31,) (31,) Count 2 Tasks 1 Chunks Type float64 numpy.ndarray - hybm(lev)float64dask.array<chunksize=(30,), meta=np.ndarray>
- long_name :
- hybrid B coefficient at layer midpoints
Array Chunk Bytes 240 B 240 B Shape (30,) (30,) Count 2 Tasks 1 Chunks Type float64 numpy.ndarray - mdt()int32...
- long_name :
- timestep
- units :
- s
array(1800, dtype=int32)
- n2ovmr(time)float64dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- n2o volume mixing ratio
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type float64 numpy.ndarray - nbdate()int32...
- long_name :
- base date (YYYYMMDD)
array(18500101, dtype=int32)
- nbsec()int32...
- long_name :
- seconds of base date
array(0, dtype=int32)
- ndbase()int32...
- long_name :
- base day
array(0, dtype=int32)
- ndcur(time)int32dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- current day (from base day)
Array Chunk Bytes 93.44 kB 2.02 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type int32 numpy.ndarray - nlon(lat)int32dask.array<chunksize=(192,), meta=np.ndarray>
- long_name :
- number of longitudes
Array Chunk Bytes 768 B 768 B Shape (192,) (192,) Count 2 Tasks 1 Chunks Type int32 numpy.ndarray - nsbase()int32...
- long_name :
- seconds of base day
array(0, dtype=int32)
- nscur(time)int32dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- current seconds of current day
Array Chunk Bytes 93.44 kB 2.02 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type int32 numpy.ndarray - nsteph(time)int32dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- current timestep
Array Chunk Bytes 93.44 kB 2.02 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type int32 numpy.ndarray - ntrk()int32...
- long_name :
- spectral truncation parameter K
array(1, dtype=int32)
- ntrm()int32...
- long_name :
- spectral truncation parameter M
array(1, dtype=int32)
- ntrn()int32...
- long_name :
- spectral truncation parameter N
array(1, dtype=int32)
- sol_tsi(time)float64dask.array<chunksize=(504,), meta=np.ndarray>
- long_name :
- total solar irradiance
- units :
- W/m2
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type float64 numpy.ndarray - time_bnds(time, nbnd)objectdask.array<chunksize=(11680, 2), meta=np.ndarray>
- long_name :
- time interval endpoints
Array Chunk Bytes 373.76 kB 186.88 kB Shape (23360, 2) (11680, 2) Count 3 Tasks 2 Chunks Type object numpy.ndarray - time_written(time)|S8dask.array<chunksize=(504,), meta=np.ndarray>
Array Chunk Bytes 186.88 kB 4.03 kB Shape (23360,) (504,) Count 48 Tasks 47 Chunks Type |S8 numpy.ndarray - w_stag(slat)float64dask.array<chunksize=(191,), meta=np.ndarray>
- long_name :
- staggered latitude weights
Array Chunk Bytes 1.53 kB 1.53 kB Shape (191,) (191,) Count 2 Tasks 1 Chunks Type float64 numpy.ndarray - wnummax(lat)int32dask.array<chunksize=(192,), meta=np.ndarray>
- long_name :
- cutoff Fourier wavenumber
Array Chunk Bytes 768 B 768 B Shape (192,) (192,) Count 2 Tasks 1 Chunks Type int32 numpy.ndarray
- Conventions :
- CF-1.0
- NCO :
- 4.3.4
- Version :
- $Name$
- history :
- 2019-08-01 00:15:18.487461 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.001.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:19.080785 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.002.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:20.252396 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.003.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:20.787281 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.004.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:21.279874 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.005.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:21.850205 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.006.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:22.423595 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.007.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:23.127816 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.008.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:23.695110 xarray.open_dataset('/glade/collections/cdg/data/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/b.e11.B20TRC5CNBDRD.f09_g16.009.cam.h2.PRECT.1990010100Z-2005123118Z.nc') 2019-08-01 00:15:24.291352 
[Output truncated: the catalog opens the 6-hourly PRECT file for each CESM-LE ensemble member (b.e11.B20TRC5CNBDRD.f09_g16.*.cam.h2.PRECT.1990010100Z-2005123118Z.nc) and combines them with xarray.concat(..., dim='member_id', coords='minimal').]
The dataset's attributes include this important note: "This data is part of the project 'Blind Evaluation of Lossy Data-Compression in LENS'. Please exercise caution before using this data for other purposes."
# PRECT is a precipitation rate in m/s; multiplying by the number of
# seconds in a 6-hour interval gives total precipitation in meters.
hour = 60 * 60
precip_in_m = lens.PRECT * (6 * hour)
precip_in_m
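To make the conversion concrete with a single number (the rate below is illustrative, not drawn from LENS):

```python
# A rate in m/s times the 21,600 seconds in a 6-hour interval
# gives a precipitation depth in meters for that interval.
seconds_per_6hr = 6 * 60 * 60        # 21600 s
rate_m_per_s = 1.0e-7                # illustrative light-rain rate
depth_m = rate_m_per_s * seconds_per_6hr
print(depth_m)                       # ~0.00216 m, i.e. about 2.2 mm in 6 hours
```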
precip_in_m is a dask-backed DataArray of 186.01 GB (float32) with dimensions (member_id: 36, time: 23360, lat: 192, lon: 288), chunked as (2, 504, 192, 288) into 222.95 MB pieces (846 chunks, 1693 tasks). Its coordinates are member_id (ensemble members 1-35 and 104), 6-hourly time from 1990-01-01 06:00:00 to 2006-01-01 00:00:00 (cftime.DatetimeNoLeap), lat from -90.0 to 90.0 (degrees_north), and lon from 0.0 to 358.75 (degrees_east).
We’ll select the first member for comparison with the ERA5 histogram.
lens_hist = histogram(
    precip_in_m.isel(member_id=0).rename("tp_6hr"),
    bins=[tp_6hr_bins], dim=["lon"]
).mean(dim="time")
lens_hist.data
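The reduction pattern here — histogram along one dimension, then average along another — can be sketched with plain NumPy on hypothetical toy data (xhistogram applies the same idea lazily to the dask-backed xarray arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
# toy "precipitation" field with dims (time, lat, lon)
precip = rng.exponential(scale=0.001, size=(10, 4, 8))
bins = np.array([0.0, 1e-4, 1e-3, 1e-2, 1.0])

# histogram along the trailing "lon" axis for every (time, lat) pair ...
counts = np.apply_along_axis(lambda v: np.histogram(v, bins=bins)[0], -1, precip)
# ... then average over "time", leaving a small (lat, bin) summary
summary = counts.mean(axis=0)
print(summary.shape)  # (4, 4): one row of bin counts per latitude
```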
Note that we’re using the aws_client, because LENS is stored in an AWS region.
lens_hist_ = aws_client.compute(lens_hist)
Compare results¶
Let’s plot the histograms for both the ERA5 and LENS data. These are small results so it’s safe to transfer the result from the cluster to the client machine for plotting.
lens_tp_hist_ = lens_hist_.result()
era5_tp_hist_ = era5_tp_hist_.result()
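These transfers are cheap because of the size asymmetry at the heart of this workflow; quick arithmetic makes it concrete (the array shape comes from the LENS repr above; the bin count of 50 is illustrative):

```python
# Full LENS PRECT array: float32 (4 bytes) over (member, time, lat, lon)
full_bytes = 36 * 23360 * 192 * 288 * 4
# A (lat, bin) histogram summary: float64 over 192 latitudes x 50 bins
summary_bytes = 192 * 50 * 8
print(full_bytes / 1e9)     # ~186 GB stays in the cloud region
print(summary_bytes / 1e3)  # ~77 KB comes back to the laptop
```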
For ERA5:
era5_tp_hist_[:, 1:].plot(xscale='log');
And for LENS:
lens_tp_hist_[:, 1:].plot(xscale='log');
Cleanup¶
Closing the clients and clusters releases the cloud resources we provisioned.
aws_client.close()
aws.close()
gcp_client.close()
gcp.close()
Behind the Scenes¶
We deployed some infrastructure to make this notebook runnable. In line with one of Pangeo’s guiding principles, each of these technologies has an open architecture.

From low-level to high-level:
Terraform provides the tools for provisioning the cloud resources needed for the clusters.
Kubernetes provides the container orchestration for deploying the Dask clusters. We created Kubernetes clusters in AWS’s us-west-2 and GCP’s us-central1 regions.
Dask Gateway provides centralized, secure access to Dask clusters. The gateways were deployed with Helm on the two Kubernetes clusters.
Dask provides scalable, distributed computation for analyzing these large datasets.
xarray provides high-level APIs and high-performance data structures for working with this data.
Intake, gcsfs, and s3fs provide catalogs for data discovery and libraries for loading that data.
JupyterLab provides a user interface for interactive computing. The client laptop interacts with the clusters through JupyterLab.
All of the resources for this demo are available at https://github.com/pangeo-data/multicloud-demo.